Gaussian dynamic warping (GDW) method applied to text-dependent speaker detection and verification
Authors
Abstract
This paper introduces a new acoustic modeling method called Gaussian Dynamic Warping (GDW). It targets real-world applications such as voice-based entrance door security systems, the example presented in this paper. The proposed approach uses a hierarchical statistical framework with three levels of specialization for the acoustic modeling. The highest level of specialization is additionally responsible for modeling the temporal constraints through a dedicated Temporal Structure Information (TSI) component. Preliminary results show that the GDW method can account for the acoustic variability of speech while capturing important temporal constraints.
Similar resources
Improvement of speaker verification for Thai language
Many strategies have been proposed for speaker verification (SV) systems, in both the text-dependent (fixed-text) and text-independent (free-text) domains. To identify an appropriate algorithm for Thai speech, this paper compares several successive improvements, including dynamic time warping (DTW) matching and Gaussian mixture model (GMM) based systems. We firstly developed a syste...
Using phoneme recognition and text-dependent speaker verification to improve speaker segmentation for Chinese speech
Speaker segmentation is widely used in tasks such as multi-speaker detection and speaker tracking. Segmentation performance depends to a large extent on the performance of speaker verification (SV) between two short utterances, so improving SV performance on short utterances would greatly benefit segmentation. In this paper, a method based on phoneme rec...
Speaker Recognition Using Gaussian Mixtures Models
A speech signal contains several levels of information. At the first level, it carries the spoken message. At the second level, it also conveys information about the speaker's identity, emotional state, and so on. The task of speaker recognition can be divided into two parts: speaker identification and speaker verification. Speaker identification is answering the question which one o...
Frame-level Nonlinearity for Robust DTW-based Speaker Verification
Dynamic time warping (DTW) is a successful algorithm in many matching and searching tasks. For text-dependent speaker verification, it remains an appropriate choice when enrollment data are very limited. Yet DTW is very sensitive to endpoint variations between the reference template and the test examples. Most research reported on this issue follows two directions: robust endpoint det...
Feature and score normalization for speaker verification of cellular data
This paper presents some experiments with feature and score normalization for text-independent speaker verification of cellular data. The speaker verification system is based on cepstral features and Gaussian mixture models with 1024 components. The following methods, which have been proposed for feature and score normalization, are reviewed and evaluated on cellular data: cepstral mean subtrac...